Label-free morphological sub-population cytometry for sensitive phenotypic screening of heterogenous neural disease model cells

Label-free image analysis has several advantages with respect to the development of drug screening platforms. However, the evaluation of drug-responsive cells based exclusively on morphological information is challenging, especially in cases of morphologically heterogeneous cells or a small subset of drug-responsive cells. We developed a novel label-free cell sub-population analysis method called “in silico FOCUS (in silico analysis of featured-objects concentrated by anomaly discrimination from unit space)” to enable robust phenotypic screening of morphologically heterogeneous spinal and bulbar muscular atrophy (SBMA) model cells. This method with the anomaly discrimination concept can sensitively evaluate drug-responsive cells as morphologically anomalous cells through in silico cytometric analysis. As this algorithm requires only morphological information of control cells for training, no labeling or drug administration experiments are needed. The responses of SBMA model cells to dihydrotestosterone revealed that in silico FOCUS can identify the characteristics of a small sub-population with drug-responsive phenotypes to facilitate robust drug response profiling. The phenotype classification model confirmed with high accuracy the SBMA-rescuing effect of pioglitazone using morphological information alone. In silico FOCUS enables the evaluation of delicate quality transitions in cells that are difficult to profile experimentally, including primary cells or cells with no known markers.


Metabolism measurement
Metabolic measurements (glucose, lactate, glutamic acid (Glu), and glutamine (Gln)) were performed using BioProfile FLEX2 (Nova Biomedical K.K., Tokyo, Japan). Under each culture condition, 400 μL of the culture supernatant was analyzed. Supernatants were collected at days 2 and 3 post-seeding. The consumption or production rate of each component was determined.

Image acquisition
Phase-contrast microscopy images were acquired for cells grown in 24-well plates using an automatic cell image acquisition system (BioStation CT, Nikon Corporation, Tokyo, Japan). Cells were seeded at a density of 2000 cells/cm² in triplicate. The cells were cultured for 2 days and then allowed to undergo neural differentiation for 2 days. The images were captured at 4× magnification (single point per well, covering 4 mm²; 1000 pixels²/image). For each condition, 3-6 replicate images were collected. Each image was taken at the center of the well to minimize disturbance from the meniscus and included more than 200 cells.

Unsupervised analysis of the morphological profile data
Morphological similarities among individual data (iDs) and population data (pDs) were visualized using principal component analysis (PCA). For data-segmented visualization, all data were first plotted together in a single PCA. The weights of all parameters were saved and reused to plot each category of data separately. Hierarchical clustering (using the correlation coefficient with average linkage) was used to categorize the target experimental conditions, followed by the assignment of morphological categories. To reduce the bias introduced by highly correlated parameters in the clustering, the mean and standard deviation (SD) of the texture parameters were excluded. All analyses and visualizations were performed using R (version 3.4.1) (R Development Core Team, https://www.r-project.org/).
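
As an illustration of the clustering step, the following is a minimal Python sketch of agglomerative clustering that uses 1 minus the Pearson correlation coefficient as the distance and average linkage. It is not the R code used in this study. The condition names, the feature values, and the function names `pearson` and `average_linkage` are hypothetical and chosen only for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_linkage(profiles, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    Distance between two profiles: 1 - Pearson r.
    Distance between two clusters: mean of all pairwise member distances.
    """
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical morphological profiles (one vector of parameters per condition).
profiles = {
    "control":  [1.0, 2.0, 3.0, 4.0],
    "vehicle":  [1.1, 2.1, 2.9, 4.2],
    "DHT_20nM": [4.0, 3.0, 2.0, 1.0],
    "DHT_2nM":  [3.8, 3.1, 2.2, 0.9],
}
clusters = average_linkage(profiles, n_clusters=2)
print(clusters)
```

Because the distance is correlation-based, conditions are grouped by the shape of their parameter profile rather than by absolute magnitude, which is one reason strongly correlated parameters can dominate the result if they are not excluded first.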

Sample size effect evaluation
From the control and DHT (20 nM) response condition data, random sampling was repeated to generate 50 datasets for each sample size of 10, 50, 100, 150, 200, and 250 iDs. In silico FOCUS was applied to the DHT response condition data using the unit space trained with the control data. After the Mahalanobis distance was calculated for every iD of the target cells, the Mahalanobis distance distributions of the control and the target were compared using Welch's t-test. Across the 50 trials for each sample size, the number of trials in which the target iDs were significantly discriminated from the control was counted and expressed as a percentage.
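
The procedure above can be sketched in Python under stated assumptions. The unit space is taken to be the mean vector and inverse covariance matrix of the sampled control cells, and each trial compares the control and target Mahalanobis distances with Welch's t-test. The synthetic two-feature data, the 20% responder fraction, the significance level of 0.05, and all function names are assumptions made for illustration. They are not taken from the study, whose analysis was performed in R.

```python
import math
import random
from statistics import mean, variance

def invert(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    d = len(m)
    a = [list(row) + [float(i == j) for j in range(d)] for i, row in enumerate(m)]
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [v / piv for v in a[c]]
        for r in range(d):
            if r != c:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[d:] for row in a]

def fit_unit_space(control):
    """Unit space: mean vector and inverse covariance of control cells only."""
    n, d = len(control), len(control[0])
    mu = [mean(r[j] for r in control) for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in control) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, invert(cov)

def mahalanobis(x, mu, inv_cov):
    """Mahalanobis distance of one cell from the unit space."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    q = sum(diff[i] * inv_cov[i][j] * diff[j]
            for i in range(len(diff)) for j in range(len(diff)))
    return math.sqrt(max(q, 0.0))

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value from Student's t density by Simpson integration."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps
    s = pdf(0.0) + pdf(abs(t))
    s += sum((4 if k % 2 else 2) * pdf(k * h) for k in range(1, steps))
    return max(0.0, 1.0 - 2.0 * s * h / 3.0)

def welch_p(a, b):
    """Two-sided p-value of Welch's unequal-variance t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_two_sided_p(t, df)

def detection_rate(control, target, n, trials=50, alpha=0.05, seed=0):
    """Percentage of trials in which target distances differ significantly."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        c, t = rng.sample(control, n), rng.sample(target, n)
        mu, inv_cov = fit_unit_space(c)  # trained with control cells only
        dc = [mahalanobis(x, mu, inv_cov) for x in c]
        dt = [mahalanobis(x, mu, inv_cov) for x in t]
        hits += welch_p(dc, dt) < alpha
    return 100.0 * hits / trials

# Synthetic data (assumption): 20% of target cells are shifted "responders".
rng = random.Random(1)
control = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(1000)]
target = []
for _ in range(1000):
    shift = 4.0 if rng.random() < 0.2 else 0.0
    target.append([rng.gauss(0, 1) + shift, rng.gauss(0, 1) + shift])

rates = {n: detection_rate(control, target, n) for n in (10, 50, 250)}
print(rates)
```

In this sketch the control distances are computed on the same cells that define the unit space. Whether held-out control cells were used instead is not stated here, so this choice is also an assumption.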

Construction of the classification model
To construct the phenotype classification model, the following three types of phenotype data were used: disease data, 20 pDs from the disease phenotype of AR-97Q cells responding to DHT; rescued data, 40 pDs from the rescued phenotype of AR-97Q cells responding to PG (1 μM) with DHT; and healthy data, 20 pDs from the healthy phenotype of AR-24Q cells responding to DHT. Model A was trained to classify disease and rescued data, while Model B was trained to classify disease and healthy data. The performance of the two models was validated by leave-(all samples from the same well)-out cross-validation (a modification of leave-one-out in which all pDs from the same well are left out together). Healthy data were used as blind test data for Model A, and rescued data were used as blind test data for Model B. LASSO was used for the discrimination model, which was coded using R (version 3.4.1).

Fig. S1 Discrimination performance of the bulk assay on 20,000-30,000 model cells. All the percentages indicate the proportion of 50 trials in which the target was significantly discriminated from the control, as evaluated using Welch's t-test. iD, individual data; in silico FOCUS, in silico analysis of featured-objects concentrated by anomaly discrimination from unit space; DHT, dihydrotestosterone.
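
The leave-(all samples from the same well)-out cross-validation used for the classification models can be illustrated with a minimal Python sketch. To keep the example short, a nearest-centroid classifier is used as a stand-in for the LASSO model, so it shows only the grouping logic of the validation and not the discrimination model itself. The well names, feature values, and function names are hypothetical.

```python
def leave_well_out(samples):
    """Yield (well, train, test), holding out all pDs of one well together."""
    for well in sorted({s["well"] for s in samples}):
        train = [s for s in samples if s["well"] != well]
        test = [s for s in samples if s["well"] == well]
        yield well, train, test

def fit(train):
    """Stand-in classifier (NOT LASSO): one centroid per phenotype label."""
    model = {}
    for label in sorted({s["label"] for s in train}):
        rows = [s["x"] for s in train if s["label"] == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model

def predict(model, x):
    """Assign the label whose centroid has the smallest squared distance to x."""
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, model[lab])))

# Hypothetical pDs: three wells per phenotype, four pDs per well.
samples = []
for label, base, wells in (("disease", 0.0, ["A1", "A2", "A3"]),
                           ("rescued", 5.0, ["B1", "B2", "B3"])):
    for wi, well in enumerate(wells):
        for k in range(4):
            samples.append({"well": well, "label": label,
                            "x": [base + 0.1 * wi + 0.05 * k, base - 0.1 * k]})

correct = total = 0
for well, train, test in leave_well_out(samples):
    model = fit(train)  # trained without any pD from the held-out well
    for s in test:
        correct += predict(model, s["x"]) == s["label"]
        total += 1
accuracy = correct / total
print(accuracy)
```

Holding out whole wells prevents pDs from the same well from appearing in both the training and test sets, which would otherwise tend to inflate the apparent accuracy.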